Evolving Stochastic Context-Free Grammars from Examples Using a Minimum Description Length Principle
Author
Abstract
This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a genetic algorithm, with a fitness function derived from a minimum description length principle. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. We provide details of our fitness function for grammars and present the results of a number of experiments in learning grammars for a range of formal languages.
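To make the idea of a description-length fitness concrete, here is a minimal sketch of a generic two-part MDL score for a stochastic grammar: bits to encode the rules plus bits to encode the sample given the rules. It is not the paper's fitness function, whose details are not reproduced on this page. The rule encoding, the function names (`grammar_bits`, `data_bits`, `description_length`) and the toy grammar are all assumptions made for illustration, and derivations are supplied by hand rather than produced by a parser.

```python
import math

# Hypothetical toy grammar for a^n b^n; each rule is (lhs, rhs, probability).
RULES = [
    ("S", ("a", "S", "b"), 0.5),
    ("S", ("a", "b"), 0.5),
]

def grammar_bits(rules):
    """Bits to write down the rules, assuming a uniform code over symbols."""
    symbols = {s for lhs, rhs, _ in rules for s in (lhs, *rhs)}
    per_symbol = math.log2(len(symbols))
    # Each rule costs one symbol for its left side plus one per right-side symbol.
    return sum((1 + len(rhs)) * per_symbol for _, rhs, _ in rules)

def data_bits(derivations, rules):
    """Bits to encode the sample: -log2 of each rule probability used."""
    return sum(-math.log2(rules[i][2]) for d in derivations for i in d)

def description_length(derivations, rules):
    """Two-part score; a genetic algorithm would try to minimize it."""
    return grammar_bits(rules) + data_bits(derivations, rules)

# Hand-written derivations (lists of rule indices) for "ab" and "aabb".
SAMPLE = [[1], [0, 1]]
print(round(description_length(SAMPLE, RULES), 3))
```

Under a score of this kind, a grammar with many rules pays more in grammar bits, while a grammar that assigns low probability to the sample pays more in data bits, so minimizing the total trades the two off against each other.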
Similar resources
Learning Stochastic Categorial Grammars
Stochastic categorial grammars (SCGs) are introduced as a more appropriate formalism for statistical language learners to estimate than stochastic context free grammars. As a vehicle for demonstrating SCG estimation, we show, in terms of crossing rates and in coverage, that when training material is limited, SCG estimation using the Minimum Description Length Principle is preferable to SCG est...
Learning the Grammar of Human Activity from Video
Stochastic Context-Free Grammars (SCFG) have been shown to be useful for applications beyond natural language analysis, specifically vision-based human activity analysis. Vision-based symbol strings differ from natural language strings, in that a string of symbols produced by video often contains noise symbols, making grammatical inference very difficult. In order to obtain reliable resul...
An MDL Approach to Learning Activity Grammars
Stochastic Context-Free Grammars (SCFG) have been shown to be useful for vision-based human activity analysis. However, action strings from vision-based systems differ from word strings, in that a string of symbols produced by video contains noise symbols, making grammar learning very difficult. In order to learn the basic structure of human activities, it is necessary to filter out these noise...
Unsupervised induction of stochastic context-free grammars using distributional clustering
An algorithm is presented for learning a phrase-structure grammar from tagged text. It clusters sequences of tags together based on local distributional information, and selects clusters that satisfy a novel mutual information criterion. This criterion is shown to be related to the entropy of a random variable associated with the tree structures, and it is demonstrated that it selects linguisti...
Learning context-free grammars to extract relations from text
In this paper we propose a novel relation extraction method, based on grammatical inference. Following a semisupervised learning approach, the text that connects named entities in an annotated corpus is used to infer a context free grammar. The grammar learning algorithm is able to infer grammars from positive examples only, controlling overgeneralisation through minimum description length. Eva...